Matrix

Series

DATA FRAMES

Missing Values

import numpy as np
import pandas as pd
from numpy.random import randn

# DataFrame with missing values (NaN)
d = {"A": [1, 2, np.nan], "B": [5, np.nan, np.nan], "C": [1, 2, 3]}
dfg = pd.DataFrame(d)
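The frame above contains NaNs; a minimal sketch of the usual pandas handling with `dropna` and `fillna`, reusing the same dictionary:

```python
import numpy as np
import pandas as pd

d = {"A": [1, 2, np.nan], "B": [5, np.nan, np.nan], "C": [1, 2, 3]}
df = pd.DataFrame(d)

dropped = df.dropna()          # keep only rows with no missing values
filled = df.fillna(df.mean())  # replace NaN with each column's mean
```

`dropna` keeps only the first row here (the only complete one), while `fillna(df.mean())` imputes column means without discarding data.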

Matplotlib

import matplotlib.pyplot as plt

Legends

import numpy as np
import matplotlib.pyplot as plt

# x, x2, x3 defined here so the snippet runs on its own
x = np.linspace(0, 5, 11)
x2 = x ** 2
x3 = x ** 3

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
ax.plot(x, x2, label='x2')
ax.plot(x, x3, label='x3')
ax.legend(loc=0)  # loc=0 lets matplotlib choose the best position

Seaborn

Grids

Plotly and cufflinks

Choropleth Maps / Geographical Plots

MACHINE LEARNING

import pandas as pd
import numpy as np
# Jupyter magic: render plots inside the notebook
%matplotlib inline
import matplotlib.pyplot as plt

Data processing tools

dataset = pd.read_csv('Data.csv')
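`Data.csv` is not included in these notes; a minimal sketch of the usual preprocessing steps (impute missing values, one-hot encode categoricals, train/test split) on a hypothetical stand-in frame — the Country/Age/Salary/Purchased layout is an assumption, not the actual file:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for Data.csv (column layout is an assumption)
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Age": [44.0, 27.0, np.nan, 38.0],
    "Salary": [72000.0, 48000.0, 54000.0, np.nan],
    "Purchased": ["No", "Yes", "No", "Yes"],
})

# Fill missing numeric values with the column mean
num_cols = ["Age", "Salary"]
dataset[num_cols] = SimpleImputer(strategy="mean").fit_transform(dataset[num_cols])

# One-hot encode the categorical column, then split features/target
X = pd.get_dummies(dataset.drop(columns="Purchased"), columns=["Country"])
y = dataset["Purchased"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
```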

Multiple Linear Regression

Used when there is more than one independent variable

Y = b0 + b1*X1 + b2*X2 + ... + bn*Xn
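A minimal sketch of fitting the equation above with scikit-learn, on synthetic noiseless data where the true coefficients are b0 = 3, b1 = 2, b2 = -1 (chosen here purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y = 3 + 2*x1 - 1*x2, no noise
rng = np.random.default_rng(0)
X = rng.random((50, 2))
y = 3 + 2 * X[:, 0] - 1 * X[:, 1]

model = LinearRegression().fit(X, y)
# model.intercept_ recovers b0; model.coef_ recovers b1, b2
```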

Polynomial Regression

Simple linear: Y = b0 + b1*X1

Multiple linear: Y = b0 + b1*X1 + b2*X2 + ... + bn*Xn

Polynomial: Y = b0 + b1*X1 + b2*X1^2 + ... + bn*X1^n
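Polynomial regression is still linear regression on expanded features (1, x, x^2, ...). A minimal sketch using `PolynomialFeatures`, with true coefficients b0 = 1, b1 = 2, b2 = 3 chosen for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: y = 1 + 2*x + 3*x^2
x = np.linspace(-1, 1, 30).reshape(-1, 1)
y = 1 + 2 * x.ravel() + 3 * x.ravel() ** 2

# Expand x into columns [1, x, x^2], then fit an ordinary linear model
X_poly = PolynomialFeatures(degree=2).fit_transform(x)
model = LinearRegression(fit_intercept=False).fit(X_poly, y)
```

`fit_intercept=False` because the bias column 1 is already in `X_poly`, so `model.coef_` holds b0, b1, b2 directly.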

Support Vector Regression

Feature Scaling

Do not apply to dummy variables

Do not apply if Y is binary

Apply when the dependent and independent variables are on different scales

Apply after the train/test split

Do not apply when the relationship between X and Y is explicit, as in linear regression (the coefficients absorb the scale)

Apply when the relationship between X and Y is implicit, as in SVR
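A minimal sketch of SVR with feature scaling, wrapping `StandardScaler` and `SVR` in a pipeline so scaling is fitted on training data only (the sine data and the `C=100` value are illustrative assumptions; the course also scales y, which is omitted here for brevity):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

x = np.linspace(0, 10, 100).reshape(-1, 1)
y = np.sin(x).ravel()

# The RBF kernel has no explicit X-Y relation, so scaling X matters here
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100))
model.fit(x, y)
score = model.score(x, y)  # R^2 on the training data
```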

Decision tree Regression

No need for feature scaling
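Trees split on raw feature thresholds, so unscaled inputs are fine. A minimal sketch on a tiny illustrative dataset:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Tiny illustrative dataset; no scaling applied
X = np.array([[1], [2], [3], [4], [5]], dtype=float)
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

tree = DecisionTreeRegressor(random_state=0).fit(X, y)
pred = tree.predict([[3]])  # a fully grown tree reproduces training targets
```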

Random Forest Regression
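Random forest regression averages many decision trees trained on bootstrap samples. A minimal sketch on synthetic linear data (the data and `n_estimators=50` are illustrative choices):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: y = 2*x + 1
rng = np.random.default_rng(0)
X = rng.random((200, 1)) * 10
y = X.ravel() * 2 + 1

# 50 trees, each fit on a bootstrap sample; predictions are averaged
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
score = forest.score(X, y)
```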

CLASSIFICATION

Logistic Regression
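A minimal sketch of logistic regression as a classifier, using the built-in iris dataset as an illustrative example:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_iter raised so the solver converges on this data
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
acc = clf.score(X_test, y_test)
```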

KNN (K-Nearest Neighbors)

Assumes similarity between the new case/data and the available data; e.g. it classifies a new picture by comparing it with stored pictures
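A minimal sketch of KNN classifying by similarity to stored examples, again on the iris dataset with k = 5 (an illustrative choice):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each test point is assigned the majority class of its 5 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
acc = knn.score(X_test, y_test)
```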

Support Vector Machine (SVM)

Linear SVM & Non-linear SVM

Kernel SVM
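A minimal sketch contrasting linear and kernel SVMs on data that is not linearly separable (concentric circles, an illustrative dataset): the linear kernel cannot separate the rings, while the RBF kernel can.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: not separable by any straight line
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)

linear_acc = linear.score(X, y)  # near chance level
rbf_acc = rbf.score(X, y)        # near perfect
```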

Naive Bayes
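A minimal sketch of Gaussian Naive Bayes, which applies Bayes' theorem assuming features are independent given the class (iris used as an illustrative dataset):

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# Fits one Gaussian per feature per class, then combines them via Bayes' rule
nb = GaussianNB().fit(X, y)
acc = nb.score(X, y)
```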

Decision Tree Classification

Random Forest Classification

UNSUPERVISED MACHINE LEARNING

K-Means Clustering
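A minimal sketch of K-Means on synthetic blob data (the three-cluster setup is an illustrative assumption; in practice the elbow method helps choose k):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Three well-separated blobs; labels are discarded (unsupervised)
X, _ = make_blobs(n_samples=150, centers=3, random_state=42)

km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
labels = km.labels_        # cluster assignment per point
centers = km.cluster_centers_
```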

Hierarchical Clustering
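A minimal sketch of agglomerative (bottom-up) hierarchical clustering, which repeatedly merges the closest clusters until the requested number remains (two blobs used as an illustrative dataset):

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=2, random_state=0)

# Each point starts as its own cluster; closest pairs are merged until 2 remain
hc = AgglomerativeClustering(n_clusters=2).fit(X)
labels = hc.labels_
```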

Apriori Practical

Eclat

Natural Language Processing

DEEP LEARNING